A class of nonlinear ARCH processes is introduced and studied. 
The existence of a strictly stationary and $\beta$-mixing solution is established 
under a mild assumption on the density of the underlying 
independent process. We give sufficient conditions for the existence 
of moments. The analysis relies on Markov chain theory. The model 
generalizes some important features of standard ARCH models and 
is amenable to further analysis. 

1. Introduction. Since the appearance of seminal papers by Engle [9] 
and Bollerslev [2], a variety of GARCH (generalized autoregressive conditionally 
heteroskedastic) specifications have been introduced to model the 
characteristic features of observed financial time series. These specifications 
are of the form 

(1.1) $\varepsilon_t = \sigma_t \eta_t$, $t \in \mathbb{Z}$, 

where the sequence $(\eta_t)$ is independent and identically distributed (i.i.d.) 
with zero mean and unit variance, and $\sigma_t$ is a positive variable called volatility, 
which is a measurable function of the past, $\{\varepsilon_{t-i}, i > 0\}$. Typically, $\varepsilon_t$ 
represents the log-return, that is, the variation of the logarithm of the 
price. 

The original model specified $\sigma_t^2$ as a linear function of the squared past 
log-returns and was found adequate to capture many stylized facts associated 
with the financial data, namely tail heaviness, volatility clustering, leptokurtosis 
of the marginal distribution and dependence without autocorrelation. 
Other characteristic properties such as asymmetries motivated extensions of 
the basic model (see, e.g., [15, 20]). A common feature of these models is 
that $\sigma_t$ is specified as a strictly increasing function of the modulus of the 
past returns. In general, the specification of $\sigma_t$ involves a linear combination 
of some function of the past returns. 

In this paper, we consider a class of nonlinear ARCH processes. More 
precisely, the model we study in this paper is given by 

(1.2) $\varepsilon_t = \sigma_t \eta_t$, 
      $\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 \mathbb{1}_{\{\varepsilon_{t-1}^2 > k \varepsilon_{t-2}^2\}}$, 

where $\omega$, $\alpha$ and $k$ are nonnegative constants with $\omega > 0$ and where the same 
assumptions are made concerning $(\eta_t)$ as in (1.1). The standard ARCH(1) 
model is obtained as a particular case by taking $k = 0$. The conditionally 
homoskedastic model (constant volatility) can be obtained by setting $\alpha = 0$, 
but it is worth noting that "large" values of $k$ also produce a model which is 
close to being homoskedastic. This model belongs to the class of endogenous 
switching regime models, in the spirit of the threshold autoregressive models 
of Tong and Lim [19]. In the present model, the volatility equation can be 
interpreted as a two-regime specification, the first regime being homoskedastic 
($\sigma_t^2 = \omega$) and the second one being a classical ARCH(1) ($\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2$). 
The originality of the specification, however, is that the regime change depends 
on the relative variation of the last squared observation. As long 
as the relative variations ($\varepsilon_{t-1}^2/\varepsilon_{t-2}^2$) are small, the process remains in the 
homoskedastic regime. But, when these variations are large, the volatility depends 
on the last squared observation. The coefficient $k$ allows for flexibility 
in the occurrence of the two regimes. Empirical motivations for model (1.2) 
based on the features of real financial time series can be found in the dissertation 
by Saidi [17]. 
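
To make the two-regime mechanism of (1.2) concrete, the following is a minimal simulation sketch. It assumes Gaussian innovations and zero start-up values, and the names `simulate_model` and `sample_variance` are illustrative choices rather than anything taken from the paper.

```python
import random

def simulate_model(omega, alpha, k, n, seed=0, burn_in=500):
    """Simulate n values eps_t of model (1.2), assuming eta_t ~ N(0, 1)."""
    rng = random.Random(seed)
    prev2 = 0.0   # eps_{t-2}^2 (start-up value, an arbitrary choice)
    prev = 0.0    # eps_{t-1}^2 (start-up value, an arbitrary choice)
    path = []
    for t in range(n + burn_in):
        # ARCH regime only when the last squared value exceeds k times the previous one
        in_arch_regime = prev > k * prev2
        sigma2 = omega + (alpha * prev if in_arch_regime else 0.0)
        eps = sigma2 ** 0.5 * rng.gauss(0.0, 1.0)
        prev2, prev = prev, eps * eps
        if t >= burn_in:
            path.append(eps)
    return path

def sample_variance(values):
    """Plain (biased) sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# alpha = 0 gives the homoskedastic model; alpha = 2, k = 3 switches between regimes.
homoskedastic = simulate_model(omega=1.0, alpha=0.0, k=2.0, n=5000)
switching = simulate_model(omega=1.0, alpha=2.0, k=3.0, n=5000)
```

With $\alpha = 0$ the sample variance should be close to $\omega$; the second call uses a value of $\alpha$ above 1, which the results of Section 3 allow when $k > 0$.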

The aim of this paper is to study the stability properties of the specifica- 
tion in (1.2). Recent references dealing with ergodic properties of GARCH- 
type models are [1, 5, 10, 11]. These papers use a random coefficient linear 
representation of the volatility, of the form $\sigma_t^2 = \omega(\eta_{t-1}) + \alpha(\eta_{t-1})\sigma_{t-1}^2$ in the 
first-order case, which does not hold in our framework. A different approach 
is used by Cline and Pu [7] who establish sharp conditions for geometric 
ergodicity of a class of threshold autoregressive ARCH models under as- 
sumptions we will discuss further. 

The rest of the paper proceeds as follows. In Section 2, we recall the 
main results of Markov chain theory that we will use in the sequel. Section 
3 is devoted to the existence of strictly stationary solutions. We start by 
considering the deterministic model implied by (1.2). Then we establish 
conditions for the existence of strictly stationary and $\beta$-mixing solutions. 
Finally, we provide conditions for the existence of moments. 
2. Some Markov chain results. In this section, we give results from the 
theory of Markov chains that allow one to study the existence of ergodic 
solutions to stochastic difference equations. This section is heavily based on 
the book by Meyn and Tweedie [13]. Let $E \subset \mathbb{R}^d$ and let $\mathcal{E}$ be the Borel $\sigma$-field 
on $E$. We denote by $\{X_t, t \ge 0\}$ a homogeneous Markov chain on $(E, \mathcal{E})$ 
and denote by $P^t(x, B) = \mathbb{P}(X_t \in B \mid X_0 = x)$ the probability of moving from 
$x \in E$ to the set $B \in \mathcal{E}$ in $t$ steps. The Markov chain $(X_t)$ is $\phi$-irreducible if, 
for some nontrivial $\sigma$-finite measure $\phi$ on $(E, \mathcal{E})$, 

$\forall B \in \mathcal{E}$, $\phi(B) > 0 \Rightarrow \forall x \in E$, $\exists t > 0$, $P^t(x, B) > 0$. 

If $(X_t)$ is $\phi$-irreducible, there exists a maximal irreducibility measure $M$ (see 
[13], Proposition 4.2.2) and we set $\mathcal{E}^+ = \{B \in \mathcal{E} \mid M(B) > 0\}$. The chain is 
called positive recurrent if 

$\forall x \in E$, $\forall B \in \mathcal{E}^+$, $\limsup_{t \to \infty} P^t(x, B) > 0$. 

For a $\phi$-irreducible Markov chain, positive recurrence is equivalent (see [13], 
Theorem 18.2.2) to the existence of a (unique) invariant distribution, that 
is, a probability measure $\pi$ such that 

$\forall B \in \mathcal{E}$, $\pi(B) = \int P(x, B)\,\pi(dx)$. 

"Geometric ergodicity" refers to the rate of convergence of the transition 
probabilities to the invariant distribution. More precisely, if $\|\cdot\|$ denotes 
the total variation norm, the Markov chain $(X_t)$ is said to be geometrically 
ergodic if there exists a $\rho \in (0, 1)$ such that 

(2.1) $\forall x \in E$, $\rho^{-t}\,\|P^t(x, \cdot) - \pi\| \to 0$ as $t \to +\infty$. 
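
As an illustration of definition (2.1), the sketch below tracks the total variation norm $\|P^t(x, \cdot) - \pi\|$ for a small finite-state chain. The two-state transition matrix is a hypothetical example chosen here for its simple arithmetic; it is unrelated to model (1.2), and the total variation norm is taken as the sum of absolute differences.

```python
def step(dist, P):
    """One step of a finite-state chain: row vector times transition matrix."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def tv_norm(p, q):
    """Total variation norm of the signed measure p - q (sum of absolute differences)."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical two-state chain; its invariant distribution is pi = (0.6, 0.4).
P = [[0.8, 0.2],
     [0.3, 0.7]]
pi = [0.6, 0.4]

dist = [1.0, 0.0]          # start from state 0
distances = []
for t in range(1, 11):
    dist = step(dist, P)
    distances.append(tv_norm(dist, pi))

# Successive ratios equal the second eigenvalue 0.5 of P,
# so (2.1) holds for any rho in (0.5, 1).
ratios = [distances[t + 1] / distances[t] for t in range(len(distances) - 1)]
```

For this chain the distance is $0.8 \times 0.5^t$, so the decay is exactly geometric; for general chains (2.1) only asserts an upper rate.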

In order to state the following criterion for the geometric ergodicity of a 
Markov chain, we need the notions of T-chain, small sets and aperiodicity. 
For any distribution $a = (a_n)$ on the set of positive integers, for all $x \in E$ 
and $B \in \mathcal{E}$, let $K_a(x, B) = \sum_{n \ge 1} a_n P^n(x, B)$. Recall that if $E$ is endowed 
with a metric, a function $h: E \to \mathbb{R}$ is called lower semicontinuous if for 
any constant $c$, the set $\{x : h(x) > c\}$ is open. Now, if for any open set $B$, 
the function $P(\cdot, B)$ is lower semicontinuous, $(X_t)$ is called a Feller Markov 
chain. More generally, if there exists a function $T: E \times \mathcal{E} \to [0, +\infty)$ and 
a distribution $a = (a_n)$ on the set of positive integers such that (i) $T(\cdot, B)$ 
is lower semicontinuous, $\forall B \in \mathcal{E}$, (ii) $T(x, \cdot)$ is a nontrivial measure over 
$(E, \mathcal{E})$, $\forall x \in E$ and (iii) $K_a(x, B) \ge T(x, B)$, $\forall x \in E$, $\forall B \in \mathcal{E}$, then $(X_t)$ is 
called a T-chain and $T$ is called a continuous component of $K_a$. A set $C \in \mathcal{E}$ 
is called a $\nu_m$-small set if there exist an $m > 0$ and a nontrivial measure 
$\nu_m$ on $\mathcal{E}$ such that $\forall x \in C$ and $\forall B \in \mathcal{E}$, $P^m(x, B) \ge \nu_m(B)$. Let $C$ be a 
$\nu_M$-small set where the measure $\nu_M := \nu$ is such that $\nu(C) > 0$. Such a 



Y. SAIDI AND J.-M. ZAKOIAN 



measure exists whenever $C \in \mathcal{E}^+$ (see [13], Proposition 5.2.4). Let $E_C = 
\{m \ge 1 \mid C$ is $\nu_m$-small with $\nu_m = \delta_m \nu$ for some $\delta_m > 0\}$. Then if $(X_t)$ is a 
$\phi$-irreducible Markov chain and $C \in \mathcal{E}^+$, the greatest common divisor $d$ of the 
set $E_C$ does not depend on $C$ and is called the period of the Markov chain. If 
$d = 1$, $(X_t)$ is said to be aperiodic. If every compact set is small, then $(X_t)$ is 
a T-chain. If $(X_t)$ is a $\phi$-irreducible T-chain, then every compact set is small 
(see [13], Proposition 5.5.7 and Theorem 6.2.5). However, some noncompact 
sets may also be small, and such sets can be worth considering, as we shall 
see. 

We are now in a position to state a criterion for geometric ergodicity based 
on $m$-step transitions, which is adapted from [13], Theorem 19.1.3. The use 
of $m$-step transitions in ergodicity criteria was suggested by Tjøstheim [18]. 

Theorem 2.1. Assume that: 

(i) $(X_t)$ is $\phi$-irreducible for some measure $\phi$ on $(E, \mathcal{E})$, 

(ii) $(X_t)$ is an aperiodic T-chain, 

(iii) there exists a small set $C \in \mathcal{E}^+$, an integer $m \ge 1$ and a nonnegative 
continuous function (test function) $g: E \to [0, +\infty)$ such that 

$E[g(X_{t+m}) \mid X_t = x] \le \begin{cases} (1 - \beta)\,g(x) - \beta, & x \in C^c, \\ b, & x \in C, \end{cases}$ 

for some strictly positive constants $\beta$ and $b$. Then $(X_t)$ is geometrically 
ergodic. Moreover, $E_\pi g(X_t)$ is finite, where $E_\pi$ denotes expectation taken 
under the stationary distribution. 

One consequence of the geometric ergodicity is that the Markov chain 
$(X_t)$ is $\beta$-mixing, and hence strongly mixing, with geometric rate. Recall 
that for a stationary process, the $\beta$-mixing coefficients are defined by 

(2.2) $\beta_X(k) = E \sup_{B \in \sigma(X_s, s \ge k)} |\mathbb{P}(B \mid \sigma(X_s, s \le 0)) - \mathbb{P}(B)|$. 

The process is called $\beta$-mixing if $\lim_{k \to \infty} \beta_X(k) = 0$. If $Y = (Y_t)$ is a process 
such that $Y_t = f(X_t, \ldots, X_{t-r})$ for some measurable function $f$ and some 
integer $r \ge 0$, then $\sigma(Y_t, t \le s) \subset \sigma(X_t, t \le s)$ and $\sigma(Y_t, t \ge s) \subset \sigma(X_{t-r}, t \ge s)$. 
Thus, 

(2.3) $\beta_Y(k) \le \beta_X(k - r)$ for all $k \ge r$. 

Davydov [8] showed that for an ergodic Markov chain $(X_t)$ with invariant 
probability measure $\pi$, 

$\beta_X(k) = \int \|P^k(x, \cdot) - \pi\|\,\pi(dx)$. 

Noting that in (2.1), the rate $\rho$ can be chosen independently of the initial 
point $x$, it follows that $\beta_X(k) = O(\rho^k)$ if (2.1) holds. 
3. Existence of stationary solutions. In this section, we consider the 
problem of the existence of strictly stationary and second-order stationary 
solutions to model (1.2). The problem is not standard because, contrary to 
most ARCH-type specifications, no linear representation of the model seems 
to exist. Hence, we cannot rely on the theory developed in the papers by 
Bougerol and Picard [3, 4]. Instead, we will use the techniques of Tweedie 
to deal with the stationarity question. 

Thinking of the standard ARCH(1) model, we could perhaps expect to 
require a (strict and second-order) stationarity condition of the form $\alpha < 1$. 
The presence of the (conditionally) homoskedastic regime seems to allow 
us greater freedom. As for the threshold autoregressive models, it will be 
helpful to first consider the deterministic model. 

3.1. Stability of the deterministic model. Suppose that, in model (1.2), 
the i.i.d. process $(\eta_t)$ is such that $\eta_t^2 = 1$, for all $t$, almost surely. We call this 
model deterministic, although the sign of $\varepsilon_t$ is, of course, a random variable. 
For ease of exposition, we take $\varepsilon_0 = 0$, but any other initial value would also 
produce the following asymptotic results: 

Theorem 3.1. Let $(\varepsilon_t)_{t \ge 0}$ be as defined in (1.2), with $\eta_t^2 = 1$, a.s. for 
all $t$, and $\varepsilon_0 = 0$. Then: 

(i) if $\max(\alpha, 1) < k$ or $\alpha = 0$, then there exists $i \ge 3$ such that $\forall t \ge i$, 
$\varepsilon_t^2 = \omega$ a.s.; 

(ii) if $\alpha < 1$ and $k \le 1$, then $\varepsilon_t^2 \to \omega/(1 - \alpha)$ a.s. when $t \to +\infty$; 

(iii) if $\alpha \ge \max(1, k)$, then $\varepsilon_t^2 \to +\infty$ a.s. when $t \to +\infty$. 

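Proof. We have, a.s., $\varepsilon_1^2 = \omega$ and $\varepsilon_2^2 = \omega(1 + \alpha)$. The value of $\varepsilon_3^2$ depends 
on the position of $1 + \alpha$ compared to $k$. Let, for all $i \ge 0$, 

$E_i = \{(\alpha, k) : (1 + \alpha + \cdots + \alpha^{i+1})/(1 + \alpha + \cdots + \alpha^{i}) \le k\}$. 

Since $\alpha \ge 0$, the sets $E_i$ constitute an increasing sequence. We have 

$E_\infty := \bigcup_{i \ge 0} E_i = \{\alpha = 0, k \ge 1\} \cup \{0 < \alpha < 1 < k\} \cup \{1 \le \alpha < k\}$ 
$= \{\max(\alpha, 1) < k\} \cup \{(0, 1)\}$. 

Let us consider the different cases. 

Case (i). We have $(\alpha, k) \in E_\infty$, hence there exists $i \ge 0$ such that $(\alpha, k) \in E_i$. 
Let $i_0 = \min\{i \ge 0, (\alpha, k) \in E_i\}$. For $1 \le i \le i_0 + 2$, we have $\varepsilon_i^2 = \omega(1 + \cdots + \alpha^{i-1})$. 
Then 

$\varepsilon_{i_0+2}^2/\varepsilon_{i_0+1}^2 = (1 + \cdots + \alpha^{i_0+1})/(1 + \cdots + \alpha^{i_0}) \le k$. 

It follows that $\varepsilon_{i_0+3}^2 = \omega$ and, since $\varepsilon_{i_0+3}^2/\varepsilon_{i_0+2}^2 = 1/(1 + \cdots + \alpha^{i_0+1}) \le 1 \le k$, that $\varepsilon_i^2 = \omega$ 
for all $i \ge i_0 + 3$. 

Case (ii). If $(\alpha, k) \ne (0, 1)$, then $(\alpha, k) \notin E_\infty$. Thus, for all $i \ge 1$, $\varepsilon_i^2 = 
\omega(1 + \cdots + \alpha^{i-1})$ and the result follows. When $(\alpha, k) = (0, 1)$, the sequence 
$(\varepsilon_i^2)$ takes the constant value $\omega$. 

Case (iii). We have $(\alpha, k) \notin E_\infty$. Thus, for all $i \ge 1$, $\varepsilon_i^2 = \omega(1 + \cdots + \alpha^{i-1})$ 
and the sequence $(\varepsilon_i^2)$ tends to $+\infty$. □ 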

From this result, the region of nonexplosion of the deterministic model 
is given by $\alpha < \max(1, k)$. We now turn to the general case. 
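
The three regimes of Theorem 3.1 can be checked numerically by iterating the deterministic recursion. The sketch below is only an illustration: the function name `deterministic_path` and the parameter values are chosen here, and the pre-sample value $\varepsilon_{-1}^2$ is set to 0, which does not affect $\varepsilon_1^2 = \omega$.

```python
def deterministic_path(omega, alpha, k, n):
    """Return eps_1^2, ..., eps_n^2 for model (1.2) with eta_t^2 = 1 and eps_0 = 0."""
    prev2, prev = 0.0, 0.0   # eps_{-1}^2 (assumed 0) and eps_0^2
    out = []
    for _ in range(n):
        # sigma_t^2 = omega + alpha * eps_{t-1}^2 * 1{eps_{t-1}^2 > k * eps_{t-2}^2}
        cur = omega + (alpha * prev if prev > k * prev2 else 0.0)
        out.append(cur)
        prev2, prev = prev, cur
    return out

case_i = deterministic_path(omega=1.0, alpha=0.5, k=2.0, n=60)    # max(alpha, 1) < k
case_ii = deterministic_path(omega=1.0, alpha=0.5, k=0.5, n=60)   # alpha < 1, k <= 1
case_iii = deterministic_path(omega=1.0, alpha=1.5, k=1.2, n=60)  # alpha >= max(1, k)
```

In the first case the path is $1, 1.5, 1, 1, \ldots$, so it is constant from $t = 3$ on; in the second it increases to $\omega/(1 - \alpha) = 2$; in the third it grows like $1.5^t$.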

3.2. Markov chain results. As with many discrete-time models, the analysis 
of the probability structure of model (1.2) draws on Markov chain results. 
Let 

$X_t = (\varepsilon_t^2, \varepsilon_{t-1}^2)'$, $x = (x_1, x_2)'$, 

and let 

$\forall x \in \mathbb{R}^2$, $\psi(x) = \omega + \alpha x_1 \mathbb{1}_{\{x_1 > k x_2\}}$. 

The vector representation of model (1.2) takes the form of a nonlinear 
stochastic difference equation, 

(3.1) $X_t = (\psi(X_{t-1})\,\eta_t^2, \varepsilon_{t-1}^2)' := F(X_{t-1}, \eta_t^2)$, $t \ge 1$, 

where the i.i.d. sequence $(\eta_t)$ is supposed to be independent of the initial 
state $X_0$. Note that models of the form (3.1) are considered, among others, 
by [13], Chapter 7, but under a smoothness assumption on the function $F$ 
which is not valid in our framework. Let $\lambda_m^+$ be the Lebesgue measure on $\mathbb{R}^{+m}$ and 
let $\mathcal{B}(\mathbb{R}^{+m})$ be the Borel class of sets for $\mathbb{R}^{+m}$. We will make the following 
assumption: 

Assumption A. The variables $\eta_t^2$ admit a density $f$ with respect to $\lambda_1^+$, 
with $f > 0$ on $\mathbb{R}^+$. Moreover, $E\eta_t = 0$ and $E\eta_t^2 = 1$. 

Lemma 1. The process $(X_t)_{t \ge 0}$ is a time-homogeneous Markov chain on 
$\mathbb{R}^{+2}$, with transition probabilities given as follows: 

$\forall x = (x_1, x_2) \in \mathbb{R}^{+2}$, $\forall B = B_1 \times B_2 \in \mathcal{B}(\mathbb{R}^{+2})$, 

(3.2) $P(x, B) = \mathbb{P}[\eta_t^2 \in \psi(x)^{-1} B_1]\,\mathbb{1}_{\{x_1 \in B_2\}}$. 

Moreover, under Assumption A, the process $(X_t)$ is $\lambda_2^+$-irreducible ($\lambda_2^+$ is 
therefore a maximal irreducibility measure). 
Proof. Equation (3.1) ensures that $(X_t)$ is a time-homogeneous Markov 
chain. The two-step transition probabilities are given as follows: 

$\forall B = B_1 \times B_2 \in \mathcal{B}(\mathbb{R}^{+2})$, $\forall x \in \mathbb{R}^{+2}$, 

$P^2(x, B) = \mathbb{P}[\psi(X_{t-1})\eta_t^2 \in B_1, \varepsilon_{t-1}^2 \in B_2 \mid X_{t-2} = x]$ 
(3.3) $= \mathbb{P}[\psi\{\psi(x)\eta_{t-1}^2, x_1\}\eta_t^2 \in B_1, \psi(x)\eta_{t-1}^2 \in B_2]$ 
$= \int \mathbb{1}_{\psi(x)^{-1}B_2}(y)\,\mathbb{P}[\eta_t^2 \in \psi\{\psi(x)y, x_1\}^{-1}B_1]\,f(y)\,d\lambda_1^+(y)$. 

This can be seen by using the Fubini theorem, using the independence between 
$\eta_t$ and $\eta_{t-1}$ and noting that $\psi(\cdot) \ge \omega > 0$. If $\lambda_1^+(B_1) > 0$, we have, for 
all $y \in \mathbb{R}^+$, $\mathbb{P}[\eta_t^2 \in \psi\{\psi(x)y, x_1\}^{-1}B_1] > 0$, in view of Assumption A. Similarly, 
$\lambda_1^+\{\psi(x)^{-1}B_2\} > 0$ if $\lambda_1^+(B_2) > 0$. Hence, $P^2(x, B) > 0$, which ensures 
that $\lambda_2^+$ is an irreducibility measure. We have $P^t(x, B) = 0$ for any Borel set 
$B \subset (\mathbb{R}^-)^2$, any $t > 0$ and any $x \in \mathbb{R}^2$. Thus, any irreducibility measure $\phi$ 
is such that $\phi(B) = 0$ for any $B \in \mathcal{B}(\mathbb{R}^{-2})$. It follows that $\lambda_2^+$ is a maximal 
irreducibility measure (see [13], Proposition 4.2.2). □ 
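
The vector recursion (3.1) is easy to write down explicitly. The following is a minimal sketch with illustrative function names (`psi`, `F`, `iterate`) and hand-picked values of $\eta_t^2$, chosen here so that the arithmetic can be followed by eye.

```python
def psi(x, omega, alpha, k):
    """psi(x) = omega + alpha * x1 * 1{x1 > k * x2}."""
    x1, x2 = x
    return omega + (alpha * x1 if x1 > k * x2 else 0.0)

def F(x, eta2, omega, alpha, k):
    """One step of (3.1): X_t = (psi(X_{t-1}) * eta_t^2, first component of X_{t-1})."""
    return (psi(x, omega, alpha, k) * eta2, x[0])

def iterate(x0, eta2_values, omega, alpha, k):
    """Apply F successively, returning the visited states including x0."""
    states = [x0]
    for eta2 in eta2_values:
        states.append(F(states[-1], eta2, omega, alpha, k))
    return states

# Hypothetical innovations eta_t^2 = 1, 0.5, 4 starting from x0 = (2, 1).
states = iterate((2.0, 1.0), [1.0, 0.5, 4.0], omega=1.0, alpha=0.5, k=1.5)
```

Starting from $(2, 1)$ the chain is first in the ARCH regime ($2 > 1.5$), then twice in the homoskedastic regime, giving the states $(2, 2)$, $(0.5, 2)$ and $(4, 0.5)$.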

Remark 1. Cline and Pu [6] provide conditions for irreducibility (as 
well as aperiodicity and the T-chain property) for a general class of nonlin- 
ear autoregressive models encompassing (1.2). Since we use slightly weaker 
conditions for the error density, we give direct proofs of the corresponding 
lemmas. 

Remark 2. The transition probability defined in (3.2) is a function of 
$x$ which is not lower semicontinuous for every open set $B$. To see this, let 
$x_1 = k x_2$, let $B = B_1 \times B_2$ be an open set such that $p = \mathbb{P}[\eta_t^2 \in \omega^{-1}B_1] > 
\mathbb{P}[\eta_t^2 \in (\omega + \alpha x_1)^{-1}B_1] = q$ and such that $x_1 \in B_2$. For $x = (x_1, x_2)$, we have 
$\mathbb{P}[\eta_t^2 \in \psi(x)^{-1}B_1] > c = (p + q)/2$. Any neighborhood of $x$ contains points 
$y = (y_1, y_2)$ with $y_1 > k y_2$. For such points, we have $\psi(y) = \omega + \alpha y_1$ and thus, 
if $y_1$ is sufficiently close to $x_1$, $\mathbb{P}[\eta_t^2 \in \psi(y)^{-1}B_1] < c$. The set $\{x : P(x, B) > 
c\}$ is therefore not open. It follows that $(X_t)$ is not a Feller chain. The fact 
that compact sets are small, which will be used in the verification of our 
ergodicity criterion, is thus not straightforward. This property will follow 
from the next result. 
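
The jump described in Remark 2 can be made visible numerically. The sketch below assumes Gaussian $\eta_t$ and takes $B_1 = (0, c)$, both choices made here for illustration; in that case $\mathbb{P}[\eta_t^2 < a] = \mathrm{erf}(\sqrt{a/2})$, and the names `psi` and `transition_prob` are hypothetical.

```python
import math

def psi(x1, x2, omega, alpha, k):
    """psi(x) = omega + alpha * x1 * 1{x1 > k * x2}."""
    return omega + (alpha * x1 if x1 > k * x2 else 0.0)

def transition_prob(x1, x2, c, omega, alpha, k):
    """P(x, B) from (3.2) for B = (0, c) x B_2 with x1 in B_2, assuming eta_t ~ N(0, 1).

    For Gaussian eta, P[eta^2 < a] = erf(sqrt(a / 2)).
    """
    return math.erf(math.sqrt(c / (2.0 * psi(x1, x2, omega, alpha, k))))

omega, alpha, k, c = 1.0, 1.0, 1.0, 1.0
on_boundary = transition_prob(1.0, 1.0, c, omega, alpha, k)          # x1 = k * x2
just_above = transition_prob(1.0 + 1e-9, 1.0, c, omega, alpha, k)    # x1 > k * x2
```

With these values the probability drops from about 0.68 on the boundary $x_1 = k x_2$ to about 0.52 immediately above it, which is the failure of lower semicontinuity used in the remark.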

Lemma 2. Under the assumptions of Lemma 1, the process (Xt) is a 
T-chain. 

Proof. It will be convenient to consider a partition of the positive 
quadrant of $\mathbb{R}^2$ into three regions: $D_1 = \{x_1 < k x_2\}$, $D_2 = \{x_1 = k x_2\}$ and 
$D_3 = \{x_1 > k x_2\}$. 
For $x \in D_1 \cup D_2$, we have $\psi(x) = \omega$. Thus, from (3.3), using the Fubini 
theorem and the independence between $\eta_t$ and $\eta_{t-1}$, we have 

$\forall x \in D_1 \cup D_2$, $\forall B = B_1 \times B_2 \in \mathcal{B}(\mathbb{R}^{+2})$, 
$P^2(x, B) = \mathbb{P}[\psi(\omega\eta_{t-1}^2, x_1)\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \in B_2]$ 
$= \int \mathbb{1}_{\omega^{-1}B_2 \cap (-\infty, \omega^{-1}k x_1]}(y)\,\mathbb{P}[\eta_t^2 \in \omega^{-1}B_1]\,f(y)\,d\lambda_1^+(y)$ 
$+ \int \mathbb{1}_{\omega^{-1}B_2 \cap (\omega^{-1}k x_1, +\infty)}(y)\,\mathbb{P}[\eta_t^2 \in \{\omega(1 + \alpha y)\}^{-1}B_1]\,f(y)\,d\lambda_1^+(y)$. 

By the Lebesgue theorem and Assumption A, we can conclude that $P^2(\cdot, B)$ 
is continuous over the set $D_1$. This is not the case for $x \in D_2$. However, if 
some sequence $(x_n)$ converges to $x$ with $x_n \in D_1 \cup D_2$, we have $P^2(x_n, B) \to 
P^2(x, B)$ by the same arguments. For $x_n = (x_{1n}, x_{2n}) \in D_3$, we have 

$P^2(x_n, B) = \mathbb{P}[\psi\{(\omega + \alpha x_{1n})\eta_{t-1}^2, x_{1n}\}\eta_t^2 \in B_1, (\omega + \alpha x_{1n})\eta_{t-1}^2 \in B_2]$, 

because $\psi(x_n) = \omega + \alpha x_{1n}$. Therefore, proceeding as for $D_1$, 

$\lim P^2(x_n, B) = \mathbb{P}[\psi\{(\omega + \alpha x_1)\eta_{t-1}^2, x_1\}\eta_t^2 \in B_1, (\omega + \alpha x_1)\eta_{t-1}^2 \in B_2]$. 

Setting 

$T(x, B) = \mathbb{P}[\psi\{\omega\eta_{t-1}^2, x_1\}\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \in B_2,$ 
$\psi\{(\omega + \alpha x_1)\eta_{t-1}^2, x_1\}\eta_t^2 \in B_1, (\omega + \alpha x_1)\eta_{t-1}^2 \in B_2]$, 

we define a measure for any $x$, which is nontrivial because $T(x, \mathbb{R}^{+2}) = 1$. 
Setting $a(x_1) = \omega + \alpha x_1$, $T(x, B)$ can be decomposed into three probabilities, 
depending on the position of $\eta_{t-1}^2$, as follows: 

$\mathbb{P}[\omega\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \in B_2, a(x_1)\eta_{t-1}^2 \in B_2, \eta_{t-1}^2 \le k x_1/a(x_1)]$ 

$+ \mathbb{P}[\omega\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \in B_2, \{\omega + \alpha a(x_1)\eta_{t-1}^2\}\eta_t^2 \in B_1,$ 
$a(x_1)\eta_{t-1}^2 \in B_2, \eta_{t-1}^2 \in (k x_1/a(x_1), k x_1/\omega]]$ 

$+ \mathbb{P}[\omega(1 + \alpha\eta_{t-1}^2)\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \in B_2,$ 
$\{\omega + \alpha a(x_1)\eta_{t-1}^2\}\eta_t^2 \in B_1, a(x_1)\eta_{t-1}^2 \in B_2, \eta_{t-1}^2 > k x_1/\omega]$. 

This, in view of Assumption A, shows that the function $T(\cdot, B)$ is continuous. 
Finally, $P^2(x, B) \ge T(x, B)$ for all $x$ and all $B$. Thus, $T$ is a continuous 
component of $P^2$. The conclusion follows. □ 

Classical ergodicity proofs for nonlinear stochastic difference equations 
(as, for instance, in the case of TAR models, see [16]) rely on verifying a 
drift condition when the chain goes outside a compact set. In the model 
of this paper, no drift condition holds over the region $\{x_1 \le k x_2\}$. It is 
therefore necessary to consider more general small sets than compact sets, 
as was done, for instance, by Cline and Pu [6], Theorem 2.5. 

Lemma 3. Under the assumptions of Lemma 1, the set $C = \{x_1 \le k x_2\}$ 
is small for the Markov chain $(X_t)$. Moreover, the chain is aperiodic. 

Proof. For $x = (x_1, x_2) \in C$, we have $\psi(x) = \omega$. Thus, by (3.3), for any 
$B = B_1 \times B_2 \in \mathcal{B}(\mathbb{R}^{+2})$, 

$P^2(x, B) = \mathbb{P}[\psi(\omega\eta_{t-1}^2, x_1)\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \in B_2]$ 
(3.4) $= \mathbb{P}[\omega\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \le k x_1, \omega\eta_{t-1}^2 \in B_2]$ 
$+ \mathbb{P}[(\omega + \alpha\omega\eta_{t-1}^2)\eta_t^2 \in B_1, \omega\eta_{t-1}^2 > k x_1, \omega\eta_{t-1}^2 \in B_2]$ 
$:= P_1(x, B) + P_2(x, B)$. 

Let $e > 0$. For $x_1 \ge e$, we have 

(3.5) $P_1(x, B) \ge \mathbb{P}[\omega\eta_t^2 \in B_1, \omega\eta_{t-1}^2 \le k e, \omega\eta_{t-1}^2 \in B_2] := \mu_1(B)$. 

For $x_1 < e$, we have 

$P_2(x, B) \ge \mathbb{P}[(\omega + \alpha\omega\eta_{t-1}^2)\eta_t^2 \in B_1, \omega\eta_{t-1}^2 > k e, \omega\eta_{t-1}^2 \in B_2] := \mu_2(B)$. 

The measures $\mu_1$ and $\mu_2$ are clearly nontrivial. It follows that the sets $C_1 = 
\{x_1 \ge e\} \cap C$ and $C_2 = \{x_1 < e\} \cap C$ are small. The union of two small sets 
being a small set, we may conclude that $C = C_1 \cup C_2$ is a small set. 

To prove aperiodicity, we will consider three-step transition probabilities. 
Recall that, for a $\phi$-irreducible Markov chain, the definition of the period $d$ 
is independent of the choice of a small set. For our small set, we choose $C_1$. 
For $x \in C_1$ and for $B = B_1 \times B_2 \in \mathcal{B}(\mathbb{R}^{+2})$, we have, from (3.4) and (3.5), 
after translation of the times, 

$P^2(x, B) \ge \mathbb{P}[\omega\eta_{t+1}^2 \in B_1, \omega\eta_t^2 \le k e, \omega\eta_t^2 \in B_2]$ 
$\ge \mathbb{P}[\omega\eta_{t+1}^2 \in B_1, \omega\eta_{t-1}^2 \le k e, \omega\eta_t^2 \le k e, \omega\eta_t^2 \in B_2, \eta_t^2 \le k\eta_{t-1}^2]$ 
$:= \mu(B)$, 

$P^3(x, B) = \mathbb{P}[\psi\{\psi(\omega\eta_{t-1}^2, x_1)\eta_t^2, \omega\eta_{t-1}^2\}\eta_{t+1}^2 \in B_1, \psi(\omega\eta_{t-1}^2, x_1)\eta_t^2 \in B_2]$ 
$= \mathbb{P}[\psi\{\omega\eta_t^2, \omega\eta_{t-1}^2\}\eta_{t+1}^2 \in B_1, \omega\eta_{t-1}^2 \le k x_1, \omega\eta_t^2 \in B_2]$ 
$+ \mathbb{P}[\psi\{(\omega + \alpha\omega\eta_{t-1}^2)\eta_t^2, \omega\eta_{t-1}^2\}\eta_{t+1}^2 \in B_1,$ 
$\omega\eta_{t-1}^2 > k x_1, (\omega + \alpha\omega\eta_{t-1}^2)\eta_t^2 \in B_2]$ 
$\ge \mathbb{P}[\psi\{\omega\eta_t^2, \omega\eta_{t-1}^2\}\eta_{t+1}^2 \in B_1, \omega\eta_{t-1}^2 \le k e, \omega\eta_t^2 \in B_2]$ 
$\ge \mu(B)$. 

The set $C_1$ is then both $\nu_2$-small and $\nu_3$-small, where $\nu_2 = \nu_3 = \mu$. This 
measure $\mu$ is nontrivial. The greatest common divisor $d$ of the set $E_{C_1}$ which 
appears in the definition of periodicity is thus equal to 1. The conclusion 
follows. □ 

3.3. $\beta$-mixing. The main result of this paper is the following theorem: 

Theorem 3.2. Under Assumption A and the condition $k > 0$, there 
exists a strictly stationary solution $(\varepsilon_t)$ to model (1.2). This solution is 
$\beta$-mixing, and hence strongly mixing, with geometric rate. Moreover, there 
exists $r > 0$ such that $E_\pi(\varepsilon_t^{2r}) < \infty$. 

Remark 3. It is worth noting that when $k > 0$, strict stationarity holds 
regardless of the value of $\alpha$. When $k = 0$, that is, in the case of the standard 
ARCH(1), we have the well-known strict stationarity condition established 
by Nelson [14]: $0 \le \alpha < \exp\{-E(\log \eta_t^2)\}$. 
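
For a concrete value of the bound in Remark 3, the sketch below evaluates $\exp\{-E(\log \eta_t^2)\}$ under the assumption, made here for illustration, that $\eta_t$ is standard Gaussian. It uses the standard fact that $E(\log \chi_1^2) = -\gamma - \log 2$, with $\gamma$ the Euler-Mascheroni constant, and cross-checks it by a seeded Monte Carlo average.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

# For eta ~ N(0, 1), eta^2 is chi-square with one degree of freedom and
# E[log eta^2] = -gamma - log 2 (gamma is the Euler-Mascheroni constant).
mean_log_eta2 = -EULER_GAMMA - math.log(2.0)
nelson_bound = math.exp(-mean_log_eta2)   # equals 2 * exp(gamma), about 3.56

# Monte Carlo cross-check of E[log eta^2] with a fixed seed.
rng = random.Random(1)
draws = 100_000
mc_mean = sum(math.log(rng.gauss(0.0, 1.0) ** 2) for _ in range(draws)) / draws
```

So in the Gaussian ARCH(1) case ($k = 0$) strict stationarity requires $\alpha$ below roughly 3.56, whereas for $k > 0$ Theorem 3.2 imposes no upper bound on $\alpha$.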

Remark 4. Assumption A is crucial for strict stationarity to hold without 
an upper bound for $\alpha$. For instance, in the deterministic case, $\eta_t^2 = 1$, 
a.s., Assumption A is not verified and it was seen in Section 3.1 that stability 
requires $k > \alpha$, or $k \le \alpha < 1$. 

Remark 5. Cline and Pu [7] provided useful conditions for geometric 
ergodicity of a general class of nonlinear AR-ARCH models. We cannot rely 
on their results, however, because in particular their Assumption A.5 does 
not hold for model (1.2). 

To prove Theorem 3.2, we start by establishing the following lemma: 

Lemma 4. Under the assumptions of Theorem 3.2, the Markov chain 
(Xt) is geometrically ergodic. 

Proof. The conclusion being obvious when $\alpha = 0$, we consider the case 
$\alpha > 0$. The proof consists in verifying the three conditions of Theorem 2.1 for 
$m = 2$. Property (i) holds with $\phi = \lambda_2^+$, by Lemma 1, (ii) holds by Lemmas 
2 and 3. To check (iii), we take $g(x) = g(x_1, x_2) = x_1^r$, where $r \in (0, 1]$. Let 
$\mu_{2r} = E(\eta_t^{2r})$ and let $\mu_{2r}^k = E(\eta_t^{2r}\mathbb{1}_{\{\eta_t^2 > k/\alpha\}})$. Note that these quantities are 
finite under Assumption A. We have 

(3.6) $E[g(X_{t+2}) \mid X_t = (x_1, x_2)]$ 
$= E[\varepsilon_{t+2}^{2r} \mid X_t = (x_1, x_2)]$ 
$= E[\eta_{t+2}^{2r}(\psi\{\psi(x_1, x_2)\eta_{t+1}^2, x_1\})^r]$ 
$= \mu_{2r} E[(\psi\{\psi(x_1, x_2)\eta_{t+1}^2, x_1\})^r]$ 
$= \mu_{2r} E\{\omega + \alpha\psi(x_1, x_2)\eta_{t+1}^2\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/\psi(x_1, x_2)\}}\}^r$ 
$\le \mu_{2r}\omega^r + \mu_{2r}\alpha^r\psi(x_1, x_2)^r E[\eta_{t+1}^{2r}\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/\psi(x_1, x_2)\}}]$, 

where the last inequality follows from the elementary inequality $(a + b)^r \le 
a^r + b^r$ for any $a, b \ge 0$ and $r \in (0, 1]$. For $x_1 > k x_2$, we then have 

$E[g(X_{t+2}) \mid X_t = (x_1, x_2)] \le \mu_{2r}\omega^r + \mu_{2r}\alpha^r(\omega + \alpha x_1)^r E[\eta_{t+1}^{2r}\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/(\omega + \alpha x_1)\}}]$. 

When $x_1 \to +\infty$, the right-hand side of this inequality is equivalent to 

$\alpha^{2r}\mu_{2r}\mu_{2r}^k x_1^r$. 

Now $\alpha^{2r}\mu_{2r}\mu_{2r}^k$ tends to $\mathbb{P}[\eta_t^2 > k/\alpha]$ when $r \to 0$, by the Lebesgue theorem. 
This probability being strictly less than 1 when $k > 0$ (by Assumption A), 
we have $\alpha^{2r}\mu_{2r}\mu_{2r}^k < 1$ for $r$ sufficiently small. Therefore, there exist $\beta > 0$, 
$r > 0$ and $M > 0$ such that 

$x_1 > M$ and $x_1 > k x_2 \Rightarrow E[\varepsilon_{t+2}^{2r} \mid X_t = (x_1, x_2)] \le (1 - \beta)x_1^r - \beta$. 

For $x_1 \le M$, we have $\psi(x_1, x_2) \le \omega + \alpha M$ and hence, from 

$E[\eta_{t+1}^{2r}\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/\psi(x_1, x_2)\}}] \le \mu_{2r}$ 

and (3.6), we have 

$E[\varepsilon_{t+2}^{2r} \mid X_t = (x_1, x_2)] \le \mu_{2r}\omega^r + \mu_{2r}^2\alpha^r(\omega + \alpha M)^r$. 

Finally, for $x_1 \le k x_2$, since $\psi(x_1, x_2) = \omega$, we have, by (3.6), 

$E[\varepsilon_{t+2}^{2r} \mid X_t = (x_1, x_2)] \le \mu_{2r}\omega^r(1 + \mu_{2r}\alpha^r) \le \mu_{2r}\omega^r + \mu_{2r}^2\alpha^r(\omega + \alpha M)^r$. 

We can conclude that (iii) holds, with $C = [0, M]^2 \cup \{x_1 \le k x_2\}$ and $b = 
\mu_{2r}\omega^r + \mu_{2r}^2\alpha^r(\omega + \alpha M)^r$. 

That $C$ is a small set is a consequence of Lemma 2 (implying that any 
compact set is small), Lemma 3 and the fact that the union of two small 
sets is small ([13], Proposition 5.5.5). The conclusion follows. □ 
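
The key step of the proof, namely that $\alpha^{2r}\mu_{2r}\mu_{2r}^k$ drops below 1 for small $r$ even when $\alpha$ is large, can be checked numerically. The sketch below assumes Gaussian $\eta_t$ and uses a plain midpoint rule truncated at 12 standard deviations; the function names, the truncation and the step count are choices made here for illustration.

```python
import math

def gaussian_abs_moment(power, lower=0.0, upper=12.0, steps=100_000):
    """Midpoint-rule value of E(|eta|^power * 1{|eta| > lower}) for eta ~ N(0, 1)."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += x ** power * math.exp(-0.5 * x * x)
    # factor 2 accounts for the symmetric negative half-line
    return 2.0 * total * h / math.sqrt(2.0 * math.pi)

def drift_coefficient(alpha, k, r):
    """alpha^(2r) * mu_2r * mu_2r^k, the coefficient of x_1^r in the proof of Lemma 4."""
    mu_2r = gaussian_abs_moment(2.0 * r)
    mu_2r_k = gaussian_abs_moment(2.0 * r, lower=math.sqrt(k / alpha))
    return alpha ** (2.0 * r) * mu_2r * mu_2r_k

large_r = drift_coefficient(alpha=5.0, k=1.0, r=1.0)    # far above 1
small_r = drift_coefficient(alpha=5.0, k=1.0, r=0.01)   # below 1
```

For $\alpha = 5$ and $k = 1$ the coefficient is far above 1 at $r = 1$ but below 1 at $r = 0.01$, in line with its limit $\mathbb{P}[\eta_t^2 > k/\alpha]$ as $r \to 0$.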

Proof of Theorem 3.2. Since $(X_t)$ is geometrically ergodic, it is 
$\beta$-mixing, with $E_\pi g(X_t) = E_\pi\varepsilon_t^{2r} < \infty$. It follows that $(\varepsilon_t^2)$ and $(\sigma_t)$ are 
$\beta$-mixing processes. The fact that $\varepsilon_t$ inherits this $\beta$-mixing property follows 
from the independence between $\sigma_t$ and $\eta_t$ (see, e.g., [10], proof of Theorem 
3). □ 
3.4. Existence of moments. Theorem 3.2 ensures the existence of a moment 
of some order $2r$. For statistical applications, however, it is often necessary 
to assume second-order stationarity or the existence of higher-order 
moments. The following theorem provides a sufficient condition for the 
existence of $2p$th-order moments: 

Theorem 3.3. Let $p \in \mathbb{N}$. Under Assumption A, with $\mu_{2p} = E\eta_t^{2p} < \infty$, 
if 

(3.7) $0 \le \alpha < \max_{m \in \{1, 2, \ldots\}} \left( \frac{k^{m-1}}{\mu_{2p}\,\mu_{2m}^{1-1/m}\,\mu_{2mp}^{1/m}} \right)^{1/(2p+m-1)}$, 

then there exists a strictly stationary solution process $(\varepsilon_t)$ to model (1.2) 
such that $E_\pi(\varepsilon_t^{2p}) < \infty$. 

Remark 6. For $m = 1$, the term inside the brackets reduces to $\mu_{2p}^{-1/p}$. 
A simple condition for the existence of $E(\varepsilon_t^{2p})$ is thus 

(3.8) $\mu_{2p}\alpha^p < 1$, 

which is also necessary in the standard ARCH(1) case ($k = 0$). However, the 
example below shows that when $k$ increases, the upper bound in (3.7) is 
attained for integers $m > 1$. 

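Proof of Theorem 3.3. Following the same approach as that used in 
the proof of Lemma 4, but now with $g(x) = g(x_1, x_2) = x_1^p$, we get 

$E[\varepsilon_{t+2}^{2p} \mid X_t = (x_1, x_2)]$ 
$= E\{\eta_{t+2}^{2p}\,\psi\{\psi(x_1, x_2)\eta_{t+1}^2, x_1\}^p \mid X_t = (x_1, x_2)\}$ 
$= \mu_{2p} E\{\omega + \alpha\psi(x_1, x_2)\eta_{t+1}^2\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/\psi(x_1, x_2)\}}\}^p$ 
(3.9) $= \mu_{2p}\sum_{s=0}^{p}\binom{p}{s}\omega^{p-s}\alpha^s\psi(x_1, x_2)^s E[\eta_{t+1}^{2s}\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/\psi(x_1, x_2)\}}]$. 

By the Hölder and Markov inequalities, we have, for $m \ge 1$, 

$E(\eta_{t+1}^{2s}\mathbb{1}_{\{\eta_{t+1}^2 > k x_1/\psi(x_1, x_2)\}}) \le \{E(\eta_{t+1}^{2ms})\}^{1/m}\left\{\mathbb{P}\left[\eta_{t+1}^2 > \frac{k x_1}{\psi(x_1, x_2)}\right]\right\}^{(m-1)/m}$ 
$\le \{E(\eta_{t+1}^{2ms})\}^{1/m}\left\{\frac{E(\eta_{t+1}^{2m})\,\psi(x_1, x_2)^m}{(k x_1)^m}\right\}^{(m-1)/m}$. 

When $x_1 > k x_2$ and $x_1 \to +\infty$, the right-hand side of (3.9) is thus bounded 
by a term which is equivalent to 

$\mu_{2p}\,\mu_{2mp}^{1/m}\,\mu_{2m}^{(m-1)/m}\,\frac{\alpha^{2p+m-1}}{k^{m-1}}\,x_1^p = \left\{\frac{\mu_{2p}^{1/(2p+m-1)}\,\mu_{2mp}^{1/(m(2p+m-1))}\,\mu_{2m}^{(m-1)/(m(2p+m-1))}\,\alpha}{k^{(m-1)/(2p+m-1)}}\right\}^{2p+m-1} x_1^p$. 

The right-hand side term inside the brackets being, in view of (3.7), strictly 
less than 1 for some $m \ge 1$, we thus have, for $x_1$ large enough, 

$\mu_{2p}\,\mu_{2mp}^{1/m}\,\mu_{2m}^{(m-1)/m}\,\frac{\alpha^{2p+m-1}}{k^{m-1}}\,x_1^p < (1 - \beta)x_1^p - \beta$ 

for some constant $\beta > 0$. Therefore, there exists $M > 0$ such that 

$x_1 > M$ and $x_1 > k x_2 \Rightarrow E[\varepsilon_{t+2}^{2p} \mid X_t = (x_1, x_2)] \le (1 - \beta)x_1^p - \beta$. 

Furthermore, for $x_1 \le M$, we have $\psi(x_1, x_2) \le \omega + \alpha M$ and thus, in view of 
(3.9), 

$E[\varepsilon_{t+2}^{2p} \mid X_t = (x_1, x_2)] \le \mu_{2p}\sum_{s=0}^{p}\binom{p}{s}\omega^{p-s}\alpha^s(\omega + \alpha M)^s$ 
$= \mu_{2p}\{\omega + \alpha(\omega + \alpha M)\}^p$. 

Finally, for $x_1 \le k x_2$, we have $\psi(x_1, x_2) = \omega$ and thus, from (3.9), 

$E[\varepsilon_{t+2}^{2p} \mid X_t = (x_1, x_2)] \le \mu_{2p}\{\omega + \alpha\omega\}^p$. 

We can conclude that 

$E[\varepsilon_{t+2}^{2p} \mid X_t = (x_1, x_2)] \le \begin{cases} (1 - \beta)x_1^p - \beta, & x \in C^c, \\ b, & x \in C, \end{cases}$ 

for some strictly positive constants $\beta$ and $b$, with $C = [0, M]^2 \cup \{x_1 \le k x_2\}$. 
The theorem follows. □ 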

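When $k < 1$, a necessary condition can be straightforwardly obtained as 
follows. Let $(\varepsilon_t)$ be a strictly stationary solution of model (1.2) with a finite 
$2p$th moment. Then 

$E(\varepsilon_t^{2p}) \ge \mu_{2p}[\omega^p + \alpha^p E(\varepsilon_{t-1}^{2p}\mathbb{1}_{\{\varepsilon_{t-1}^2 > k\varepsilon_{t-2}^2\}})]$ 
$= \mu_{2p}[\omega^p + \alpha^p E(\varepsilon_t^{2p}) - \alpha^p E(\varepsilon_{t-1}^{2p}\mathbb{1}_{\{\varepsilon_{t-1}^2 \le k\varepsilon_{t-2}^2\}})]$ 
$\ge \mu_{2p}[\omega^p + \alpha^p E(\varepsilon_t^{2p}) - \alpha^p k^p E(\varepsilon_t^{2p})]$. 

It follows that 

$\{1 - \mu_{2p}\alpha^p(1 - k^p)\}E(\varepsilon_t^{2p}) \ge \mu_{2p}\omega^p$. 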
Table 1 
Constraints for the existence of the second-order moment ($p = 1$), for the standard 
normal distribution, as functions of $k$. The second column gives the value of $m$ for which 
the maximum is attained in (3.7). The third column gives the constraint for $\alpha$ as a 
function of $k$ and the last column gives the maximum value for $\alpha$ when $k$ is equal to the 
upper bound of the interval 

$k$                    $m$    $\alpha$                              $\alpha_{\max}$ 
[0, 3)               1    $[0, 1)$                          1 
[3, 6.455)           2    $[0, \{k/3\}^{1/3})$              1.291 
[6.455, 12.652)      3    $[0, \{k^2/15\}^{1/4})$           1.807 
[12.652, 23.714)     4    $[0, \{k^3/105\}^{1/5})$          2.635 
[23.714, 43.297)     5    $[0, \{k^4/945\}^{1/6})$          3.936 
[43.297, 77.694)     6    $[0, \{k^5/10395\}^{1/7})$        5.976 
[77.694, 137.715)    7    $[0, \{k^6/135135\}^{1/8})$       9.181 


Therefore, a necessary condition for $E(\varepsilon_t^{2p}) < \infty$ is 
(3.10) $\mu_{2p}\alpha^p(1 - k^p) < 1$, 
and we have 

$E(\varepsilon_t^{2p}) \ge \frac{\mu_{2p}\omega^p}{1 - \mu_{2p}\alpha^p(1 - k^p)}$. 

When $k = 0$, (3.10) coincides with (3.8) and provides the necessary and 
sufficient condition for the existence of $E(\varepsilon_t^{2p})$ in the standard ARCH(1) 
case (see [12] for moment conditions for the GARCH($p$, $q$) model). 
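
The necessary condition (3.10) and the associated lower bound are simple enough to evaluate directly. The following sketch uses a hypothetical helper `moment_lower_bound`; the caller supplies $\mu_{2p}$, and the derivation above is stated for $k < 1$.

```python
import math

def moment_lower_bound(omega, alpha, k, p, mu_2p):
    """Lower bound on E(eps_t^(2p)) implied by (3.10), or inf if (3.10) fails.

    mu_2p is E(eta_t^(2p)), supplied by the caller; the bound is derived for k < 1.
    """
    slack = 1.0 - mu_2p * alpha ** p * (1.0 - k ** p)
    if slack <= 0.0:
        return math.inf   # necessary condition (3.10) violated: no finite 2p-th moment
    return mu_2p * omega ** p / slack

bound_switching = moment_lower_bound(omega=1.0, alpha=0.5, k=0.5, p=1, mu_2p=1.0)
bound_arch = moment_lower_bound(omega=1.0, alpha=0.5, k=0.0, p=1, mu_2p=1.0)
no_fourth_moment = moment_lower_bound(omega=1.0, alpha=0.6, k=0.0, p=2, mu_2p=3.0)
```

With $k = 0$ and $p = 1$ the bound is the usual ARCH(1) variance $\omega/(1 - \alpha)$; the last call uses the Gaussian value $\mu_4 = 3$ with $3\alpha^2 > 1$, so the fourth moment cannot be finite.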

Example. In the case of the standard $\mathcal{N}(0, 1)$ distribution for $\eta_t$, condition 
(3.7) can be made explicit. First, let $p = 1$. We have $\mu_{2m} = (2m)!/(2^m m!)$ and 
simple algebra shows that the maximum in (3.7) is attained for 

$m_0 = m_0(k) = \min\left\{m \in \{2, 3, \ldots\} : k < \left(\frac{(2m-1)^m}{\mu_{2(m-1)}}\right)^{1/2}\right\} - 1$. 

Thus, the second-order stationarity condition is 

$0 \le \alpha < \left(\frac{k^{m_0-1}\,2^{m_0}\,m_0!}{(2m_0)!}\right)^{1/(m_0+1)}$. 

For $k > 3$, values of $\alpha$ that are greater than 1 can be compatible with second-order 
stationarity, as can be seen from Table 1. 
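
The bound in (3.7) can be evaluated directly in the Gaussian case. The sketch below is an illustration only: the function names are chosen here, and the search over $m$ is truncated at `m_max`, a cut-off that is not part of the paper.

```python
import math

def gaussian_even_moment(n):
    """E(eta^(2n)) for eta ~ N(0, 1): (2n)! / (2^n n!)."""
    return math.factorial(2 * n) / (2 ** n * math.factorial(n))

def alpha_bound(k, p, m):
    """Term inside the max in (3.7) for a given m."""
    denom = (gaussian_even_moment(p)
             * gaussian_even_moment(m) ** ((m - 1) / m)
             * gaussian_even_moment(m * p) ** (1 / m))
    return (k ** (m - 1) / denom) ** (1 / (2 * p + m - 1))

def best_bound(k, p, m_max=30):
    """Return (upper bound for alpha, maximising m) over m = 1..m_max.

    m_max is a truncation chosen for this sketch.
    """
    return max((alpha_bound(k, p, m), m) for m in range(1, m_max + 1))

second_order = best_bound(k=5.0, p=1)    # maximum attained at m = 2
fourth_order = best_bound(k=2.0, p=2)    # maximum attained at m = 1
```

For $p = 1$ and $k = 5$ this returns $(5/3)^{1/3}$ with $m = 2$, and evaluating it at the right end-points of the intervals reproduces the $\alpha_{\max}$ columns of Tables 1 and 2 up to rounding.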

Similar computations can be carried out when $p = 2$. Table 2 provides 
the fourth-order stationarity constraints, for different ranges of values of 
$k$. The values of $m$ corresponding to the maximum in (3.7) have been obtained 
numerically. For $k < 3.416$, the maximum is reached for $m = 1$ and 
the constraint is that of a standard ARCH(1) ($3\alpha^2 < 1$). Interestingly, when 
$k$ increases, the maximum is reached for larger values of $m$ (e.g., $m = 2$ 
for $3.416 \le k < 4.579$) and larger values for $\alpha$ are obtained. It is seen that 
values of $\alpha$ larger than 1 are compatible with $E(\varepsilon_t^4) < \infty$ when $k$ is 
large. Similar tables can be constructed for any value of $p$ and for other 
distributions. 

Table 2 
As in Table 1, but for the moment of order 4 ($p = 2$) 

$k$                    $m$    $\alpha$                                                          $\alpha_{\max}$ 
[0, 3.416)           1    $[0, 1/\sqrt{3})$                                             0.577 
[3.416, 4.579)       2    $[0, \{k/(3^{3/2}\mu_8^{1/2})\}^{1/5})$                       0.612 
[4.579, 6.373)       3    $[0, \{k^2/(3\mu_6^{2/3}\mu_{12}^{1/3})\}^{1/6})$             0.684 
[6.373, 8.846)       4    $[0, \{k^3/(3\mu_8^{3/4}\mu_{16}^{1/4})\}^{1/7})$             0.787 
[8.846, 12.183)      5    $[0, \{k^4/(3\mu_{10}^{4/5}\mu_{20}^{1/5})\}^{1/8})$          0.923 
[12.183, 16.656)     6    $[0, \{k^5/(3\mu_{12}^{5/6}\mu_{24}^{1/6})\}^{1/9})$          1.098 
[16.656, 22.626)     7    $[0, \{k^6/(3\mu_{14}^{6/7}\mu_{28}^{1/7})\}^{1/10})$         1.320 
[22.626, 30.571)     8    $[0, \{k^7/(3\mu_{16}^{7/8}\mu_{32}^{1/8})\}^{1/11})$         1.599 
[30.571, 41.122)     9    $[0, \{k^8/(3\mu_{18}^{8/9}\mu_{36}^{1/9})\}^{1/12})$         1.948 

The outputs of Tables 1 and 2 are represented in Figure 1. 




Fig. 1. Stationarity regions for model (1.2) with $\eta_t \sim \mathcal{N}(0, 1)$. 1. Existence of $E\varepsilon_t^4$; 2. 
Existence of $E\varepsilon_t^2$ with $E\varepsilon_t^4 = \infty$; 3. Strict stationarity with $E\varepsilon_t^2 = \infty$. The right panel is a 
zoom of the left panel. 